3 research outputs found

    Memory hierarchy characterization of NoSQL applications through full-system simulation

    In this work, we conduct a detailed memory characterization of a representative set of modern data-management software (Cassandra, MongoDB, OrientDB and Redis) running an illustrative NoSQL benchmark suite (YCSB). These applications are widely used NoSQL databases with different data models and features such as in-memory storage. We compare how these data-serving applications behave with respect to other well-known benchmarks, such as SPEC CPU2006, PARSEC and NAS Parallel Benchmark. The evaluation methodology relies on state-of-the-art full-system simulation tools, such as gem5. This allows us to explore configurations unattainable using performance monitoring units in actual hardware, and so to characterize memory properties. The results obtained suggest that NoSQL application behavior is not dissimilar to conventional workloads. Therefore, some of the optimizations present in state-of-the-art hardware might have a direct benefit. Nevertheless, there are some common aspects that are distinctive of conventional benchmarks that might be sufficiently relevant to be considered in architectural design. Strikingly, we also found that most database engines, independently of aspects such as workload or database size, exhibit highly uniform behavior. Finally, we show that different database engines make highly distinctive demands on the memory hierarchy, some being more stringent than others. This work was supported in part by the Spanish Government (Secretaría de Estado de Investigación, Desarrollo e Innovación) under Grants TIN2015-66979-R and TIN2016-80512-R.

    New scalable coherence protocols for chip multiprocessors

    This thesis analyzes the problems associated with cache coherence in chip multiprocessors (CMPs) and introduces two new hardware-based coherence protocols. Both proposals aim to mitigate the cost of the complex on-chip memory hierarchies needed to overcome the memory bandwidth limitation (bandwidth-wall). On the one hand, for multicore systems with tens of processors on the chip, LOCKE is proposed: a broadcast-based coherence protocol focused on improving the reactiveness of the on-chip memory hierarchy. On the other hand, for future large-scale CMPs that will include hundreds or thousands of processors, MOSAIC is proposed: a scalable hybrid broadcast-directory protocol that significantly reduces the cost of maintaining hardware coherence.

    System and method for maintaining cache coherence in multiprocessor and multicore architectures

    System and method for maintaining cache coherence in multiprocessor and multicore architectures. A system and a method are described that maintain cache coherence in multiprocessor and multicore architectures by managing a set of metadata associated with each data block, called tokens, organized hierarchically at the core, chip and system levels. To achieve the object of the invention, a D/F-LLC structure is implemented alongside the shared last-level cache (LLC) of each chip; it consists of a directory and a filter that hold information about the blocks present in the private caches of that chip. Likewise, a similar structure, D/F-MEM, is implemented alongside each memory controller of each chip, holding information about the blocks being used by the different chips. Application: 201731343 (21.11.2017). Application publication no.: ES2713579A1 (22.05.2019).